Recognition of Clauses and Phrases in Machine Translation
of Languages.*


The process described here is intended to be used as part of the
system of mechanical translation developed by Ida Rhodes of the National
Bureau of Standards, as outlined e.g. in 1/. It is assumed that the
reader is familiar with 1/ and in particular with the terminology
used there. The present report may be considered as a more detailed
exposition of one section of 1/, namely the one dealing with the
establishment of a "Temporary Profile." In terms of the contemplated
machine routine this is Part II-A.

The  grammatical  information  given  here  is  based  largely  on 
^ and  V. 

It was originally planned to establish the boundaries of clauses
and  phrases  within  a sentence  in  the  course  of  analyzing  the  syntactic 
role  of  each  word  in  the  sentence  (Part  II-B  of  the  contemplated  routine). 
The  idea  to  accomplish  this  part  of  the  work  in  a separate  part  of  the 
program  arose  from  a comment  communicated  to  us  by  Mr.  M.  Sherry  of  the 
AF  Cambridge  Research  Center.  It  was  enriched  by  numerous  suggestions 
from  Ida  Rhodes,  Leroy  Meyers  and  Richard  See.  In  particular,  the 
author  is  indebted  to  Mr.  See  for  the  list  of  prepositions  in  Appendix  II. 

It is expected that the temporary profile will be established
through an iterative process which scans forward and backward in
alternation, perhaps many times, with relatively few commands. It
seems preferable to do this separately from Part II-B because the
latter is likely to have a large number of commands and fewer iterations.


* This  work  was  sponsored  by  the  Office  of  Ordnance  Research,  Department 
of  the  Army. 


The  Temporary  Profile. 


The object of this part of the code is to assign to each occurrence
a set of three numbers C, P, b, called the clause number, phrase number,
and backward signal.

Of these, C, the clause number, starts with 0 for each sentence and
numbers clauses in the order of their beginning within the sentence:
C=0, 1, 2, ..., 7 (in hand work often C=1, 2, ... 8). For each clause
we store a status symbol, v, which has the following meaning: v=0,
the predicate of this clause has not yet been found; v=2, the predicate
has been found; v=1, a possible predicate has been found. Thus v starts
with 0 in each clause anew. When a finite verb is encountered, v is set
= 2 for that clause. If, say, a short adjective which may also be an
adverb (i.e., is in the neuter singular) is encountered while v=0, we
set v=1; if subsequently a finite verb is encountered in the same
clause, we set v=2 from there on. Finally, when the end of the whole
sentence is reached, we check for clauses which have not yet reached
v=2 (i.e. we have found no definite predicate). If such a clause has
v=1, we change it to v=2 (i.e. we accept the possible predicate as
predicate); if a clause ends in v=0, there is either an error in the
profile or an implied predicate. In all the foregoing we have used
finite verbs and certain short adjectives as prototypes for definite
predicates and possible predicates. There are, however, many others,
as will be specified by the code.
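
The bookkeeping for the status symbol v can be sketched in a few lines of
Python. This is only an illustration under assumptions: the names Kind,
update_status and finalize are hypothetical and do not come from the
report, and the classification of each occurrence as a definite or
possible predicate is taken as already given by the glossary.

```python
from enum import Enum


class Kind(Enum):
    OTHER = 0
    POSSIBLE_PREDICATE = 1   # e.g. a short adjective in the neuter singular
    DEFINITE_PREDICATE = 2   # e.g. a finite verb


def update_status(v, kind):
    """Return the new status v of a clause after one more occurrence."""
    if kind is Kind.DEFINITE_PREDICATE:
        return 2                      # predicate found
    if kind is Kind.POSSIBLE_PREDICATE and v == 0:
        return 1                      # possible predicate found
    return v                          # anything else leaves v unchanged


def finalize(statuses):
    """End-of-sentence check on a list of v values indexed by clause number.

    Clauses left at v=1 are promoted to v=2; clauses left at v=0 are
    reported, since they signal a profile error or an implied predicate.
    """
    final = [2 if v == 1 else v for v in statuses]
    unresolved = [c for c, v in enumerate(final) if v == 0]
    return final, unresolved
```

A clause that sees first a short adjective and then a finite verb passes
through v=0, 1, 2; a clause that sees only the short adjective ends at v=1
and is promoted by finalize.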

The clause number, C, is = 0 for the first occurrence and all
subsequent ones until an indication of the beginning of a new clause
(a "clause opener", or C.O.) is encountered. From there on to the
next C.O., we have C=1. However, the first clause (C=0) may not yet
be completed; it may be interrupted and resumed later. Therefore, at
every indication of a changing C, we must decide whether to start a
new clause (increase C) or resume an earlier incomplete one (decrease C).

In  this  decision  we  use  the  following 

Conjecture: If clause B starts after clause A, then clause A
does not resume before clause B is completed.

Stated differently, if words (or groups of words) of clauses A
and B are denoted by a and b respectively, we may find the arrangement
a...b...b...a (where B is enclosed within A), but we cannot have the
arrangement a...b...a...b (where the dots ... may stand for words
of A, B or other clauses).
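
The conjecture amounts to saying that clauses nest like parentheses. As a
minimal sketch, assuming each occurrence has already been labeled with the
clause it belongs to, the following Python function (the name
satisfies_conjecture is hypothetical) checks a sequence of labels for the
forbidden arrangement a...b...a...b.

```python
def satisfies_conjecture(labels):
    """Return True if no clause resumes before a later-started clause ends.

    labels: one clause label per occurrence, in sentence order.
    """
    open_stack = []   # clauses begun and not yet known to be finished
    closed = set()    # clauses that must be finished if the conjecture holds
    for c in labels:
        if c in closed:
            return False              # a...b...a...b: a closed clause resumed
        if c in open_stack:
            # Resuming c closes every clause that started after it.
            while open_stack[-1] != c:
                closed.add(open_stack.pop())
        else:
            open_stack.append(c)      # a new clause begins
    return True
```

With the labels written as characters, "abba" is accepted and "abab" is
rejected.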

In  line  with  this  conjecture,  we  propose  the  following 

Rule: a) Whenever a change in C is indicated, and in addition a clause
opener follows, the new C is made higher by one than the highest C used so
far in this sentence (i.e. C̄+1, where C̄ is the "monitoring clause number"
defined below).

b) When a change in C is indicated without a new clause opener,
and the most recent clause is possibly complete (v=1 or v=2), then we
return to the most recent incomplete clause, i.e., we set the new C
equal to the highest C previously used for which v=0 (i.e., equal to
the quantity C defined below). If v > 0 for all previous C, then we
start a new clause as under (a).

c) When a change in C is indicated without a new clause opener,
and the most recent clause is incomplete (v=0), then we start a new
clause as under (a).
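
The three cases of the rule can be put together in one small function. The
following is a sketch only: the function and parameter names are
hypothetical, "the most recent clause" is taken to mean the clause of the
occurrence just before the change, and the decision that a change in C is
indicated is assumed to have been made elsewhere.

```python
def next_clause_number(current, status, opener_follows):
    """Choose the clause number to use after a change in C is indicated.

    current        -- clause number of the most recent clause
    status         -- dict mapping each clause number used so far to its v
    opener_follows -- True if a clause opener follows the change
    """
    monitoring = max(status)              # highest C used so far
    if opener_follows:                    # case (a)
        return monitoring + 1
    if status[current] == 0:              # case (c): recent clause incomplete
        return monitoring + 1
    # case (b): recent clause possibly complete (v = 1 or 2)
    incomplete = [c for c, v in status.items() if v == 0]
    if incomplete:
        return max(incomplete)            # resume most recent incomplete clause
    return monitoring + 1                 # v > 0 everywhere: start a new clause
```

For example, with clause 0 still at v=0 and clause 1 at v=2, a change
without an opener resumes clause 0, while a change followed by an opener
starts clause 2.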

The conjecture stated above can be sharpened. The case of a
clause B being "nested into" a clause A, i.e. B beginning after A
begins but before A is completed, cannot occur if A is subordinate
to B. *) **) (Most likely it cannot occur when A and B are coordinated
or unrelated, either. The only common case is that in which B, the
nested clause, is subordinate to A.)

In  general  we  shall  make  no  use  of  this  sharper  form  of  the 
conjecture,  since  our  aim  is  only  to  determine  the  beginning  and  end 
of each clause, not the relative roles of subordinate and superordinate
clauses. We shall, however, need it in connection with main
clauses.  While  the  beginning  of  other  clauses  can  usually  be  recognized 
by  a clause  opener  (a  conjunction,  relative  or  interrogative  pronoun  etc.), 
the  start  of  the  main  clause  has  usually  no  such  distinctive  mark.  This 
makes  it  difficult  to  ascertain  the  start  of  the  main  clause  in  a 
sentence  in  which  the  main  clause  is  preceded  by  one  or  more  subordinate 
clauses.  On  the  other  hand,  the  recognition  of  the  main  clause  in  such 
a sentence  is  helped  by  the  knowledge  that  the  main  clause  cannot  be 
nested  into  any  other  clause.  That  is  to  say,  if  the  main  clause  is 
not  the  first  clause  in  a sentence,  then  it  cannot  begin  until  all 
preceding  clauses  are  completed. 


*)  We  are  thus  using  the  existence  of  a predicate  or  possible  predicate  as 
the  only  criterion  for  completion  of  a clause.  This  may  be  misleading; 
better  criteria  will  be  found  in  the  course  of  Part  II,  Section  B of  the 
program,  where  the  Temporary  Profile  will  be  systematically  revised. 

**) A possible exception is the case where B is an incidental or parenthetical
clause such as "I think" or "it is well known". In such a
case the bracketing clause may be interpreted as subordinate. It
has the form of a main clause but the meaning of a subordinate one.


This can be taken care of, without change in our general plan,
by  the  following  artifice.  If  a sentence  begins  with  a subordinate 
clause  (as  indicated  by  a subordinate  clause  opener)  we  assign  to 
this  clause  the  number  C=l,  rather  than  C=0.  We  fictitiously  assign 
C=0  to  the  main  clause  as  if  it  started  the  sentence  without  a word. 

Thus,  if  subsequently  the  end  of  a clause  is  reached  according  to 
case  (b)  above,  and  if  all  intervening  clauses  have  been  completed 
also,  the  new  clause  number  chosen  will  be  automatically  C=0  — this 
playing  the  role  of  the  "most  recent  incomplete  clause." 
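
A short sketch may make the artifice concrete. The names initial_state and
resume_target are hypothetical, and the status table is represented here
as a plain dictionary from clause number to v, which is an assumption of
this example rather than anything prescribed by the report.

```python
def initial_state(starts_with_subordinate_opener):
    """Return (clause number of the first occurrence, status table).

    If the sentence opens with a subordinate clause, that clause gets C=1
    and an empty, still incomplete main clause C=0 is entered fictitiously.
    """
    if starts_with_subordinate_opener:
        return 1, {0: 0, 1: 0}
    return 0, {0: 0}


def resume_target(status):
    """Case (b): highest clause number with v=0, else a brand-new clause."""
    incomplete = [c for c, v in status.items() if v == 0]
    return max(incomplete) if incomplete else max(status) + 1
```

In a sentence of the type "When he arrived, we left", the opening clause
receives C=1; once its predicate is found and its end is indicated, the
choice under case (b) falls on C=0, the main clause.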

The phrase number, P, counts phrases within each clause. P=0
means that the current occurrence is not part of any phrase. P=1, 2, ... 6
count phrases in order of their starts. The start of a phrase is indicated
by the appearance of a "phrase opener", P.O., such as a preposition,
participle etc. The end of a phrase is usually not determined at this
stage. We assign the new phrase number to the P.O. and, in a prepositional
phrase, to the next occurrence (since such a phrase has at
least two words); thereafter we use the symbol P=7 (in hand work often
P=x) to indicate that the occurrence belongs either to the last phrase
or to no phrase at all; this is continued up to the next punctuation
mark, predicate, C.O., P.O., or up to any other occurrence which cannot
be part of the phrase. (For a P.O. following immediately after another
P.O., P is raised a second time, so that the first P is assigned only
to the P.O. itself.)
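
The assignment of P within one clause can be sketched as follows. The
occurrence kinds "prep", "opener", "stop" and "word" are a hypothetical
coding introduced for this example ("stop" stands for a punctuation mark,
predicate, C.O. or any other occurrence which cannot be part of the
phrase); giving the stopping occurrence itself P=0 is likewise an
assumption, and the limit of six phrases per clause is not enforced.

```python
UNDECIDED = 7   # belongs either to the last phrase or to no phrase at all


def assign_phrase_numbers(kinds):
    """Return the phrase number P for each occurrence of one clause."""
    result = []
    highest = 0        # monitoring phrase number
    pending = False    # next word still belongs to a prepositional phrase
    in_tail = False    # inside the undecided stretch after a phrase start
    for kind in kinds:
        if kind in ("prep", "opener"):
            highest += 1                  # every P.O. raises P
            result.append(highest)
            pending = (kind == "prep")    # a preposition claims the next word
            in_tail = True
        elif kind == "stop":
            result.append(0)
            pending = in_tail = False
        elif pending:
            result.append(highest)
            pending = False
        elif in_tail:
            result.append(UNDECIDED)
        else:
            result.append(0)
    return result
```

A preposition followed directly by another preposition yields the numbers
1, 2, 2, 7, ..., so that the first phrase number is carried by the first
P.O. alone.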

The backward indicator, b, is normally = 0, and is set = 1 only
in certain circumstances, some of which are defined below. If b=1
it  indicates  that  the  current  occurrence  must  be  linked  with  an  earlier 
one  through  a procedure  other  than  the  usual  predictions,  and  this 
information  will  be  utilized  in  Part  II-B  of  the  program. 

In  addition  to  C,  P,  b for  each  word  in  a sentence,  the  following 
numbers  are  generated  and  kept  up  to  date: 

For each clause --

the status symbol v, already discussed;

the highest phrase number used so far ("monitoring phrase
number"), P̄.

For the sentence as a whole --

the highest clause number used so far ("monitoring clause
number"), C̄;

the highest number of an incomplete clause (i.e. a clause
with v=0) used so far, C.
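
One possible way of holding these quantities is sketched below. The class
and attribute names are hypothetical; the two sentence-level numbers are
computed on demand from the per-clause records, which is a design choice
of this example and not something the report specifies.

```python
from dataclasses import dataclass, field


@dataclass
class ClauseRecord:
    v: int = 0                # status symbol
    highest_phrase: int = 0   # monitoring phrase number of this clause


@dataclass
class SentenceState:
    # One record per clause number used so far; clause 0 always exists.
    clauses: dict = field(default_factory=lambda: {0: ClauseRecord()})

    @property
    def monitoring_clause(self):
        """Highest clause number used so far."""
        return max(self.clauses)

    @property
    def highest_incomplete(self):
        """Highest clause number with v=0, or None if every clause has v>0."""
        open_ = [c for c, rec in self.clauses.items() if rec.v == 0]
        return max(open_) if open_ else None

    def open_clause(self):
        """Start a new clause numbered one above the monitoring number."""
        c = self.monitoring_clause + 1
        self.clauses[c] = ClauseRecord()
        return c
```

Keeping the per-clause records as the single source of truth avoids having
to update the two sentence-level numbers separately whenever a v changes.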




Clauses.


Before establishing detailed rules for the mechanical recognition
of the beginning and end of each clause and each phrase
occurring  in  a given  Russian  sentence,  it  is  advisable  to  obtain  a 
bird's  eye  view  of  the  kinds  of  clauses  and  phrases  existing  in  the 
Russian  language. 

Conventional grammar distinguishes about a dozen kinds of clauses,
each  with  its  own  characteristics.  In  enumerating  these  below,  we  do 
not  imply  that  a computing  machine  should  be  programmed  to  draw  all 
the  distinctions  among  them.  Such  refined  analysis  would  be  quite 
difficult  and  of  little  use.  We  do  think  that  at  some  future  time 
it  may  be  both  feasible  and  useful  for  a computer  to  differentiate 
between  coordinate  and  subordinate  clauses  and  perhaps  also  among  the 
three  major  classes  of  subordinate  clauses.  For  the  time  being, 
however,  our  sole  aim  is  to  let  the  computer  recognize  a clause  as 
a clause,  and  find  its  beginning  and  end.  Our  object  in  enumerating 
the  types  of  clauses  in  this  report  is  to  indicate  the  different 
grammatical  constructions  which  we  may  expect  to  encounter  in  Russian 
text,  the  types  of  clause  openers,  forms  of  the  predicate,  etc. 

Main and Coordinate Clauses


The  simplest  periods  consist  of  a main  clause  and  one  or  more 
subordinate  clauses.  A subordinate  clause  may  depend  either  on  the 
main clause or on another subordinate clause. Furthermore, there
exist  coordinate  clauses,  in  parallel  either  with  the  main  clause 
or  with  a subordinate  clause. 

A coordinate clause in parallel with the main clause may begin
with one of the conjunctions "and", "or", "nor", "but" (И, ИЛИ, НИ, А, НО),
or such parallel clauses may be merely separated by a comma. A coordinate
clause in parallel with a subordinate one begins in the same
way, and in addition the clause opener of the subordinate clause
may or may not be repeated. In all these respects Russian is like
English, and it will suffice to give English examples:

Congress  approved  the  bill  and  (but)  the  President  vetoed  it. 

We must hang together (,) or we shall hang separately.

United we stand, divided we fall.

All will be well if Congress passes the bill and the President
signs it.

or

All will be well if Congress passes the bill and if the
President signs it.

The  same  coordinating  conjunctions  are  used  to  join  words  rather 
than  clauses: 

Congress and the President favor the bill.

They  may  also  join  clauses  which  have  the  subject  (or  more 
rarely  the  predicate  or  object)  in  common,  in  which  case  the  common 
part  is  usually  not  repeated: 

The President signed the bill providing for ... and declared
in his message ...

Congress passed, and the President signed, the bill providing ...

These may be considered as "elliptic clauses" in which some sentence
element is missing and is understood as being repeated from another
occurrence in the same period. The case of a composite subject (or
predicate) may be interpreted as an elliptic coordinate clause.

The case of a comma separating two coordinate clauses, as in the
example above (United...), is rare in English, though more frequent in
Russian. Commas are used regularly when there are more than two
parallel clauses. We call these cases "listings". The last member
of a listing is set off by and or or (И or ИЛИ) without a comma
(in Russian):

The committee released the bill, the Senate passed it and
the President signed it into law.

Note that in English, especially in American usage, the final and
(or) in a listing is often preceded by a comma; not so in Russian.

Subordinate  Clauses 

With  minor  variations,  the  following  grouping  of  subordinate 
clauses  is  accepted  by  many  authors. 

a) Noun Clauses:

    1. Interrogative
    2. Declarative
    3. Imperative, Optative

b) Adjectival Clauses:

    4. Relative

c) Adverbial Clauses:

    5. Temporal
    6. Locative
    7. Conditional
    8. Causal
    9. Comparative
    10. Final
    11. Consecutive
    12. Concessive
    13. Modal


Noun  clauses  usually  replace  a noun  in  the  nominative  or  accusative 
case, i.e. they may serve in place of a subject, predicate nominative,
direct  object,  or  apposition.  They  can  fulfill  the  predictions  for 
these  sentence  elements. 


Adjectival  clauses  serve  syntactically  in  the  same  role  as 
adjective  modifiers. 

Adverbial clauses generally take the place of an adverb in the
superordinate clause. "I shall return soon -- I shall return when the
rain stops." For the most part these clauses begin with a characteristic
conjunction, by which they can be recognized. Like adverbs, they
are "unpredictable" in our syntactic analysis.

1. Interrogative clauses (indirect questions) are characterized by
an interrogative pronoun, interrogative adverb or other particle. He
asked whom we would meet; ... when we would go; ... whether (if) we
would return; [Double (multiple) questions:] He asked whether it would
rain or snow or be sunny.

If, in Russian, an interrogative pronoun or adverb stands at the
beginning of the subordinate clause, it is immediately preceded by a
comma. On the other hand, an interrogative pronoun or particle may
be preceded by other words, e.g. prepositions, which are part of the
same clause: They ask for whom the bell tolls. (When such a pronoun
("delayed clause opener") is encountered, we raise the clause number
for this occurrence and for all the previous ones up to the most recent
comma.)
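
The retroactive correction for a delayed clause opener can be sketched as
below. The function name and the token representation are hypothetical,
and the comma itself is assumed to keep its old clause number, a point the
text leaves open.

```python
def apply_delayed_opener(tokens, clause_numbers, i, new_c):
    """Raise C for occurrence i and for the words back to the last comma.

    tokens         -- the occurrences of the sentence, commas included
    clause_numbers -- clause number assigned so far to each occurrence
    i              -- position of the delayed clause opener
    new_c          -- clause number of the clause being opened
    """
    j = i
    while j > 0 and tokens[j - 1] != ",":
        j -= 1                             # walk back to just after the comma
    return clause_numbers[:j] + [new_c] * (i - j + 1) + clause_numbers[i + 1:]
```

Applied to "they ask , for whom" with the opener "whom" at position 4, the
preposition "for" is drawn into the new clause together with the pronoun.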

There  are  interrogative  "adjectival  pronouns"  (or  "pronominal 
adjectives")  e.g.  what  price  freedom?  He  asked  in  which  room  we 
should  stay. 

NB: Beware of the prejudice that interrogative clauses follow
only verbs with an interrogative meaning; e.g. "I decide whom we
shall see" is an interrogative clause, though no question is raised.
Sometimes interrogative clauses are divided by a hairline from relative,
conditional, or temporal clauses.

Syntactically,  an  interrogative  clause  frequently  takes  the  place 
of  a direct  object.  Thus,  if  the  main  clause  has  a transitive  predicate, 
which predicts an accusative complement with high urgency, this prediction
may be considered satisfied when an interrogative clause is
encountered.  An  interrogative  clause  may,  however,  take  the  place  of 
the  subject  or  of  a predicate  nominative.  In  general,  the  interrogative 
clause  replaces  a noun  in  the  nominative  or  accusative  case. 

To the English interrogative particle "whether" corresponds the
Russian ЛИ which is placed after the first word of the clause.

2. Declarative clauses begin with ЧТО after a comma. There is
therefore no ambiguity about recognizing them as clauses.


Like interrogative clauses they usually take the place of a
direct object, e.g. I heard the news -- I heard that the French government
fell.*) Again like interrogative clauses, they may also take the place
of the subject or of a predicate nominative or apposition. "That the
government fell was unexpected" (Subject). "Our hope is that the government
may survive" (Predicate nominative).

There are verbs, like БОЯТЬСЯ (fear), which take a noun object in
the genitive, but a clause object beginning with ЧТО like transitive
verbs. In other words, these declarative clauses take the place of a
complement in a case other than the accusative.

Instead of ЧТО the declarative clause may begin with ЧТОБ, ЧТОБЫ,
or КАК. Those beginning with КАК are close to interrogative clauses.

3. Imperative and optative clauses have the same form as declarative
ones.

4. Relative or adjectival clauses are characterized by a relative
pronoun or relative adverb. "Our father who art ..." "The place where
he stands..." Just as in interrogative clauses, if the relative pronoun
or adverb stands at the beginning of the clause, it is preceded by a
comma. (In this respect Russian is unlike English.) On the other hand,
a relative pronoun may be preceded by other words which are part of the
same clause, e.g. a preposition: "The ground on which he stands..."
(In case of such a "delayed clause opener", the clause number is treated
as in the corresponding case of interrogative clauses.)

Many relative pronouns and adverbs are homonyms of interrogative
pronouns and adverbs, in Russian just as in English; e.g. I wonder who
came (interrog.), I saw the men who came (relative). КТО, ЧТО, КОТОРЫЙ
are examples.

Relative clauses quite generally play the role of adjectives.
The relative pronoun (adverb) has an antecedent in the main (or superordinate)
clause, usually a noun or a demonstrative pronoun. This
antecedent is modified by the relative clause, just as a noun is
modified by an adjective.

5. Temporal clauses are indicated e.g. by КОГДА "when"; КАК, ЛИШЬ,
КАК ТОЛЬКО, ЛИШЬ ТОЛЬКО "as soon as"; ПОКА НЕ "as long as"; ПОКУДА НЕ;
ПЕРЕД ТЕМ, ЧТО "before"; ПОСЛЕ ТОГО, ЧТО "after"; ПРЕЖДЕ ТОГО, ЧТО;
ДО ТОГО, ЧТО. ... English conjunctions: when, as, while, whenever, before,
until, after, since.


*) "I heard why the French government fell" is an interrogative clause.

6. Locative clauses begin e.g. with ГДЕ "where", "wherever", КУДА
"whither", ОТКУДА "whence".

7. Conditional clauses begin typically with ЕСЛИ "if"; also
ЕСЛИ БЫ if the hypothesis is unreal; similarly ЛИШЬ, ЛИШЬ БЫ; also
КОЛИ БЫ or КАБЫ "if only", and others.

The  predicate  may  be  a finite  verb  or  an  infinitive. 

8. Causal clauses begin with ИБО (because); in Russian this is
often replaced by an antecedent-consequent pair like ПОТОМУ, ЧТО
(note comma before ЧТО) and a number of similar phrases: ОТ ТОГО, ЧТО;
ЗА ТО, ЧТО; ТЕМ, ЧТО; ДЛЯ ТОГО, ЧТО. Similar constructions but
without causal meaning are В ТОМ, ЧТО "in that", НЕ ТО, ЧТО "not that".
Intermediary between temporal and causal is РАЗ "once".

9. Comparative clauses begin with ЧЕМ "than"; also НЕЖЕЛИ,
occasionally КАК. Since ЧЕМ has other meanings too, these clauses
require careful handling. They are often recognizable by the previous
occurrence of a comparative adjective or adverb. In other cases a
ЧЕМ-clause is followed, rather than preceded, by a comparative; in still
other cases it is followed by a main (superordinate) clause beginning
with ТЕМ.

However, the ЧЕМ after a comparative may not introduce a clause
at all but merely a word. Also, the conjunction ЧЕМ may follow after
any kind of main clause, with the meaning "rather than".

10. Final clauses are indicated by ЧТОБЫ, ЧТОБ "so that, in
order that". The predicate is in the perfect tense or in the infinitive.

11. Consecutive clauses are indicated by ТАК ЧТО "such that, in
such a way that". The predicate is a finite verb or an infinitive.

A special difficulty lies in abbreviated infinitive constructions
with consecutive meaning: ВЫ МОЛОДЫ СУДИТЬ "you are (too) young to
judge".

12. Concessive clauses may begin with ХОТЯ or ХОТЬ, ХОТЯ И or
ХОТЬ И; instead we may find НЕСМОТРЯ НА ТО, ЧТО; ДАРОМ ЧТО or НУЖДЫ
НЕТ, ЧТО, all of which may be translated "although" or "no matter how".

13. Modal clauses, like the English ones beginning with "as", "as if",
"as though", begin with КАК, КАК БУДТО, ЯКОБЫ.


*) Note: "I shall go where you are going" -- adverbial (locative) clause.
"I know where you are going" -- noun (interrogative) clause.
"The place where you are going is near" -- adjectival (relative) clause.
Similar ambiguities occur with temporal clauses and others. In other words,
many of the subordinating conjunctions are homonyms of interrogative and/or
relative pronouns or adverbs.


Phrases.


We consider principally three kinds of phrases: (a) prepositional;
(b) adjectival, participial, gerundive, appositive; (c) incidental.

A prepositional phrase is unmistakably*) recognizable by the fact
that it starts with a preposition. The prepositions can be exhaustively
listed. The phrase may or may not be enclosed by commas; in general,
the longer it is, the more likely is it to have commas. As a first
approximation, if a preposition is preceded by a comma, we may expect
the next comma to mark the end of the prepositional phrase. There are
exceptions, however. The first comma may have a different function
(e.g. closing a preceding clause) and the prepositional phrase may end
before the next comma:

If  he  helps  you  today,  in  truth  he  must  be  your  friend. 

Or  a comma  may  occur  within  the  prepositional  phrase: 

A true  friend  helps,  without  long,  detailed  and  searching  questions, 
to  the  best  of  his  ability. 

The usefulness of this discussion lies more in accounting for
commas than in finding the end of a phrase. Since in the commonest
case -- that of a phrase of three or four words not enclosed by commas --
the end cannot be found with the limited information available at the
time of profiling, we might as well leave undecided the end of other
prepositional phrases. The rule for the profile will be that in case
of doubt only the preposition and the following word are definitely
in the phrase; subsequent words are left undecided up to the next
clear start of a new clause. The final determination of the end of
the phrase will be made in Part II-B of the machine program, by
deciding whether the subsequent words satisfy predictions made within
the phrase.

Even  this  procedure  may  tend  to  close  the  phrase  prematurely. 

There  are  prepositional  phrases  within  prepositional  phrases,  and  even 
clauses  within  prepositional  phrases.  For  instance,  in 
...  in  the  case  that  negotiations  fail  . . . 
the subordinate (declarative) clause "that ..." is part of the prepositional
phrase beginning with "in".


*)  Note,  however,  that  prepositions  are  often  homonymous  with  adverbs 

An adjectival (participial, gerundive) phrase is almost always
enclosed between commas. Its principal element is an adjective
(participle, gerund) which refers to a noun or nominal (antecedent)
outside the phrase. The antecedent (frequently) comes before the
phrase. The other elements of the phrase are dependent on the adjective
(participle, gerund) and, in the subsequent Part II-B, will
be predicted by it in the usual way. The phrase itself is considered
unpredictable.


The defense rests, secure in the expectation of victory.

Looking forward to seeing you, we remain ...

If the principal element is an adjective or participle, it agrees
with the antecedent in case, number and gender. To help establish
this agreement in Part II-B, a "backward flag" is attached to the adjective
or participle.

Occasionally the pivotal word is an adverb rather than an adjective:

The bill received little attention relative to its importance.

The case was decided similarly to a precedent.

(In the first of these examples, "relative" is an adverb, though it has
the form of an adjective. It modifies the adjective "little". In
English it is optional, in cases like this, to use the adjectival
form in place of the adverbial one, but in Russian the adverb
ОТНОСИТЕЛЬНО is used necessarily.) There is no agreement of the
adverb with any antecedent, and therefore no "backward flag."

Appositive  phrases  are  not  recognized  by  phrase  openers.  A noun 
immediately  following  a comma  is  sometimes  an  indication  of  such  a phrase. 

Incidental phrases likewise do not have phrase openers. Expressions
like ПОЭТОМУ, ОЧЕВИДНО, ТЕМ НЕ МЕНЕЕ, which frequently appear between
commas, are marked "incidental" in the glossary.


Appendix I


Clause  Openers. 

The list below contains the more frequent words by which the start
of a new clause may frequently be recognized. They are (a) the coordinating
conjunctions А, И, НИ, НО; (b) interrogative and relative pronouns
and adjectives, КОТОРЫЙ, КТО, ЧЕЙ, ЧТО, КАКОЙ, КАКОВОЙ, and their inflected
forms; (c) a large number of subordinating conjunctions. Some
of the latter are also used as interrogative or relative adverbs, e.g. ГДЕ.

Most of these stand at the beginning of the clause, preceded by
a comma, but some can be delayed, notably the pronouns and the
conjunction ЛИ.

To some English conjunctions correspond composite expressions in
Russian which may be considered as idioms, e.g. ОТ ТОГО, ЧТО...,
literally "from this, that...", could be treated as an idiom and
translated by the single word "because". Many more such cases occur,
and it is somewhat arbitrary how many are to be listed as idioms
representing conjunctions, and how many are to be translated literally.

In this connection we mention the divided idioms, such as ..., ЧЕМ,
which we may treat as "antecedent-consequent" pairs.

Some  of  the  conjunctions  are  also  used  as  adverbs,  so  that  they 
cannot  be  taken  as  sure  signs  of  the  beginning  of  a clause:  e.g. 

РАЗ = once (=as soon as) or (= at one time). Also, the interrogative
pronouns  can  occur  in  main  clauses,  instead  of  as  openers  of  subordinate 
clauses.  In  many  cases  the  occurrence  of  a comma  immediately  before 
one  of  these  doubtful  clause  openers  indicates  that  a new  clause  is 
indeed  being  opened. 
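
The comma test for doubtful clause openers might be sketched as follows.
The two word sets are small hypothetical samples chosen for illustration
(the report does not say which openers count as sure), and the three-way
answer True / None / False is an assumption of this sketch: None means
that the profile cannot decide at this stage.

```python
# Hypothetical sample sets; a real glossary would carry these marks.
SURE_OPENERS = {"ЕСЛИ", "ИБО", "ХОТЯ"}
DOUBTFUL_OPENERS = {"РАЗ", "КТО", "ЧТО", "КАК"}


def opens_clause(tokens, i):
    """Judge whether the word at position i opens a new clause.

    Returns True (opens a clause), False (not a clause opener),
    or None (a doubtful opener without a preceding comma).
    """
    word = tokens[i].upper()
    if word in SURE_OPENERS:
        return True
    if word in DOUBTFUL_OPENERS:
        preceded_by_comma = i > 0 and tokens[i - 1] == ","
        return True if preceded_by_comma else None
    return False
```

A sentence-initial РАЗ is thus left undecided, while ЧТО directly after a
comma is accepted as opening a clause.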


А
В ТОМ, ЧТО
ГДЕ
ДАРОМ, ЧТО
ДЛЯ ТОГО, ЧТОБ
ДО ТОГО, (or ЧТОБ)
ЕСЛИ; ЕСЛИ Б
ЗА ТО, ЧТО
И
ИБО
ИЛИ
КАК
КАК БУДТО
КАК ТОЛЬКО
КАКОВ
КАКОВОЙ (possibly delayed)
КАКОЙ (possibly delayed)
КОГДА; КОГДА...БЫ
КОТОРЫЙ and inflected forms (possibly delayed)
КТО and inflected forms (possibly delayed)
КУДА
ЛИ (always delayed)
ЛИШЬ ТОЛЬКО
НЕ ТО, ЧТО
НЕЖЕЛИ
НЕСМОТРЯ НА ТО, ЧТО
НИ
НО
НУЖДЫ НЕТ, ЧТОБЫ
ОТ ТОГО, ЧТО
ОТКУДА
ПЕРЕД ТЕМ, КАК (ЧТОБЫ)
ПОКА; ПОКА...НЕ
ПОКУДА...НЕ
ПОСКОЛЬКУ
ПОСЛЕ ТОГО, КАК (ЧТО)
ПОТОМУ, ЧТО
ПОЧЕМУ
ПРЕЖДЕ ЧЕМ; ПРЕЖДЕ ТОГО, ЧТОБЫ
РАЗ
С ТЕХ ПОР, КАК
СКОЛЬКО and inflected forms
ТАК ЧТО
ТЕМ, ЧТО
ХОТЬ; ХОТЬ И; ХОТЯ; ХОТЯ И
ЧЕЙ and inflected forms (possibly delayed)
ЧЕМ (after comparative)
ЧТО (interrogative or relative pronoun, with inflected forms, possibly delayed)
ЧТО (conjunction - not declined)
ЧТОБ, ЧТОБЫ
ЯКОБЫ


Appendix II

Russian Prepositions

БЕЗ             К                     ПО-ЗА
БЕЗО            КО                    +ПОЗАДИ
БЛИЗ            КРОМЕ                 ПОЗАДЬ
В               o+КРУГОМ (КРУГ)       ПОМИМО
+ВБЛИЗИ         o МЕЖ (МЕЖА)          ПО-НАД
+ВДОЛЬ          МЕЖДУ                 +ПОПЕРЕК
ВЗАМЕН          +МИМО                 ПОСЕРЕДИНЕ
ВМЕСТО          НА                    +ПОСЛЕ
ВНЕ             +НАВСТРЕЧУ            +ПОСРЕДИ
+ВНИЗУ          НАД                   ПОСРЕДИНЕ
+ВНУТРИ         o НАДО                ПОСРЕДСТВОМ
+ВНУТРЬ         +НАКАНУНЕ             ПРЕД
ВО              +НАПЕРЕКОР            ПРЕДО
ВОЗЛЕ           НАПОДОБИЕ             +ПРЕЖДЕ
+ВОКРУГ         +НАПРОТИВ             ПРИ
ВОПРЕКИ         НАСЧЕТ                ПРО
+ВПЕРЕДИ        +НЕВПРИМЕР            ПРОТИВ
+ВРОДЕ          О                     o ПУТЕМ (ПУТЬ)
ВСИЛУ           ОБ                    РАДИ
+ВСЛЕД          ОБО                   С
ВСЛЕДСТВИЕ      +ОКОЛО                СВЕРХ
ДЛЯ             ОТ                    +СВЫШЕ
ДО              ОТО                   +СЗАДИ
ЗА              o ПЕРЕД (ПЕРЁД)       СКВОЗЬ
ИЗ              ПЕРЕДО                СО
ИЗО             ПО                    СПЕРЕДИ
ИЗ-ЗА           ПОВЕРХ                СРЕДИ
ИЗ-ПОД          ПОД                   У
                +ПОДЛЕ                ЧЕРЕЗ
                ПОДО

Note: "+" indicates that the preposition may also serve as an adverb.

"o" indicates that the preposition is homographic with forms other
than adverbs. The dictionary form of the homograph is given in
parentheses when it is different from the preposition. The
prepositions consisting of a single letter may be homographic with
letters of the Cyrillic alphabet, or even with letters of the Latin
alphabet.
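
The marker convention of the Note lends itself to a small lexicon record. The following Python sketch is an assumption about how such entries might be stored; the class Preposition, the function parse_entry and the exact entry syntax (markers written before the word, dictionary form in parentheses) are hypothetical and are not taken from the machine routine.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Preposition:
    form: str                       # the preposition as it occurs in text
    may_be_adverb: bool             # marked "+" in the list
    homograph: bool                 # marked "o" in the list
    dictionary_form: Optional[str]  # dictionary form of the homograph, if any


# Markers (Latin "o" and "+"), then the word, then an optional "(FORM)".
ENTRY = re.compile(
    r"^(?P<marks>[o+]*)\s*(?P<form>[^\s()]+)(?:\s*\((?P<dict>[^)]+)\))?$"
)


def parse_entry(text: str) -> Preposition:
    """Parse one list entry such as 'o+КРУГОМ (КРУГ)' or 'ПРИ'."""
    m = ENTRY.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {text!r}")
    homograph = "o" in m["marks"]
    # When no dictionary form is given, the homograph has the same spelling.
    dictionary_form = m["dict"] or (m["form"] if homograph else None)
    return Preposition(m["form"], "+" in m["marks"], homograph, dictionary_form)


print(parse_entry("o+КРУГОМ (КРУГ)"))
print(parse_entry("ПРИ"))
```

The marker "o" is the Latin letter here, so it cannot be confused with the Cyrillic prepositions О, ОБ, ОТ, which use a different character code.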
Appendix III

Characteristics of occurrences to be used in profiling
("Profile skeleton")

1. Clause openers
      coordinate
      subordinate
      accepting infinitive as predicate
      elliptic (subject and/or predicate of clause may be omitted)
      delayed

2. Phrase openers
      prepositions
      participles (long form), after comma
      gerunds
      adjectives (long) and adverbs which can be heads of adjectival or
         adverbial phrases, mainly those with case government, after comma
      nouns after comma

3. Status-affecting occurrences (bearing on choice of predicate for
   each clause)
      finite verb
      copulative verb
      infinitive verb
      short adjectives and participles
      other predicates (МОЖНО, ВОТ, ВОН, ТАМ, ТУТ, etc.)
      words predicting an infinitive (these are noted because they
         prevent a subsequent infinitive from being interpreted as
         predicate)
      dash

4. Punctuation marks

5. Miscellaneous
      antecedent-consequent pairs
      incidentals (if between commas, prevent commas from being
         interpreted as ending a clause)
      comparative adjectives and adverbs (antecedents of ЧЕМ)
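
One possible way to hold these characteristics in a program is as a set of flags per occurrence. The Python sketch below is an assumption for illustration: the class Skeleton, its member names and the helper functions are hypothetical, and only a subset of the categories listed above is represented.

```python
from enum import Flag, auto


class Skeleton(Flag):
    """A subset of the profile-skeleton characteristics, one bit each."""
    NONE = 0                    # "no information" (written 0 in the example)
    COORDINATE = auto()         # coordinate clause opener
    SUBORDINATE = auto()        # subordinate clause opener
    ACCEPTS_INFINITIVE = auto() # opener whose clause may have an infinitive predicate
    DELAYED = auto()            # opener that need not be the first word of its clause
    PREPOSITION = auto()
    PARTICIPLE_LONG = auto()
    FINITE_VERB = auto()
    INFINITIVE = auto()
    SHORT_FORM = auto()         # short adjective or participle
    COMMA = auto()


CLAUSE_OPENER = Skeleton.COORDINATE | Skeleton.SUBORDINATE
POSSIBLE_PREDICATE = Skeleton.FINITE_VERB | Skeleton.INFINITIVE | Skeleton.SHORT_FORM


def is_clause_opener(s: Skeleton) -> bool:
    return bool(s & CLAUSE_OPENER)


def is_phrase_opener(s: Skeleton, after_comma: bool) -> bool:
    """Prepositions always open a phrase; long participles only after a comma."""
    if s & Skeleton.PREPOSITION:
        return True
    return after_comma and bool(s & Skeleton.PARTICIPLE_LONG)


# Assumed entry for ЧТОБЫ: a clause opener that accepts an infinitive predicate.
chtoby = Skeleton.SUBORDINATE | Skeleton.ACCEPTS_INFINITIVE
print(is_clause_opener(chtoby))  # True
```

Because an occurrence may have several grammatical interpretations, a flag set lets all the relevant characteristics be carried at once until later passes choose among them.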
Appendix IV

Example

The following table shows the construction of a temporary profile.
The first column lists, in order, the words of a Russian sentence,
taken from С. Н. БЕРНШТЕЙН, СОБРАНИЕ СОЧИНЕНИЙ, ТОМ I (АКАД. НАУК СССР),
page 214, lines 6-9. The second column, labeled "Profile
Skeleton", gives that part of the information extracted from our
"temporary choices" which is pertinent to the preparation of the
temporary profile. Based on this information, the machine assigns
the numbers in the columns headed C, v, P. The footnotes to the table
explain some of the rationale behind the rules which the machine is
following. The symbol 0 in the Profile Skeleton indicates "no
information", i.e., the grammatical interpretations ("temporary choices")
of the original occurrence do not contribute any information to the
formation of the temporary profile.


                                                   Temporary Profile
     Occurrence          Profile Skeleton            C    v    P    b

  1  НО                  Coord. Conj.                0    0    0    0
  2  ВМЕСТО              Prep.                       0         1
  3  ТОГО                0                           0    0    1
  4  ,                   Cma.
  5  ЧТОБЫ               C.O.; inf. pred.            1    0    0
  6  ИСКАТЬ              inf.                        1    1    0
  7  ТОЧНОЕ              0                           1         0
  8  АЛГЕБРАИЧЕСКОЕ      0                           1         0
  9  ВЫРАЖЕНИЕ           0                           1         0
 10  ДЛЯ                 Prep.                       1         2
 11  [illegible]         0                           1         2
 12  ПРИБЛИЖЕНИЯ         0                           1         x
 13  ПРИ                 Prep.                       1         3
 14  ПОМОЩИ              0                           1         3
 15  МНОГОЧЛЕНОВ         0                           1         x
 16  ДАННОЙ              partic.                     1         x
 17  СТЕПЕНИ             0                           1    1    x
 18  ,                   Cma.
 19  ЧТО                 C.O.                        2    0    0
 20  ВООБЩЕ              0                           2    0    0
 21  ,                   Cma.
 22  КАК                 C.O.                        3    0    0
 23  МЫ                  0                           3    0    0
 24  ВИДЕЛИ              Pred.                       3    2    0
 25  ,                   Cma.
 26  НЕ                  [illegible]                 2    0    0
 27  ОСУЩЕСТВИМО         partic. short,              2    1    0
                         neut. sing.
 28  ,                   Cma.
 29  [illegible]         0                           0    0    0
 30  ИЩУ                 pred.                       0    2    0
 31  ВЫРАЖЕНИЕ           0                           0         0
 32  ,                   Cma.
 33  ЯВЛЯЮЩЕЕСЯ          Partic.                     0         4    1
 34  ВПОЛНЕ              0                           0         x
 35  ТОЧНЫМ              0                           0         x
 36  ТОЛЬКО              0                           0         x
 37  ДЛЯ                 Prep.                       0         5
 38  БЕСКОНЕЧНЫХ         0                           0         5
 39  СТЕПЕНЕЙ            0                           0    2    x
 40  . *)

Notes pertaining to each occurrence q:

1. The first word always starts C=0, unless it is a subordinating
conjunction. In this way the first clause (which is usually the main
clause) is numbered 0. Since this word is not a predicate, the status
of Clause 0 is v=0. Since the word is not a phrase opener, we set
P=0, indicating that this word is not part of any phrase.

2. A preposition. We are still in clause C=0, and have not
encountered its predicate (i.e. no change in v). This word opens a
(prepositional) phrase, therefore P=1.

3. No information in the profile skeleton. The word following
a preposition must be part of the same prepositional phrase (unless it
is itself a phrase opener), therefore still P=1.

4. A comma, so marked in the profile skeleton. This is taken to
indicate the end of phrase 1. Furthermore, since phrase 1 was not
preceded by a comma, the present comma must have still another function,
e.g. mark the beginning of a new clause or phrase, or separate the
members of a listing. Which of these functions is present will be
indicated by subsequent occurrences.
5. Clause opener, therefore raise C. The profile skeleton also
indicates that the conjunction ЧТОБЫ sometimes takes a predicate in
the infinitive.

6. Verb in the infinitive. Possibly a predicate of this clause,
therefore set v=1. See also Note

7 - 9. No changes in the profile.

10.  Preposition.  Start  phrase  2. 

11. The word following a preposition is normally part of the
same phrase, therefore P=2.

12. There is no way of determining, with the information of
the profile skeleton alone, whether this word still belongs to P=2.
We therefore set P=x. (The machine program will resolve this uncertainty
in Part II-B.)

13. Preposition. Start phrase 3.

14. See 11.

15 - 17. See 12. Note that the participle in 16 is not preceded
by a comma, and is therefore not considered a phrase opener.

18. See 4.

19. Clause opener, starts Clause 2. (This is an oversimplified
example. It omits in the first iteration the complication arising
from the fact that the word ЧТО may have other functions apart from
that of a conjunction. Subsequent iterations may resolve such initial
ambiguities.)

20. No change in the profile indicated.

21. See 4.

22. Clause opener, starts Clause 3. As far as we know, clauses 0
and 2 are still incomplete; yet the word КАК calls for starting a new
clause.

23.  See  20. 

24. A finite verb, definitely a predicate. Set v=2 for clause 3.

25. See 4.

26. No clause opener. Since the comma may indicate a change in C
and the most recent clause is complete, we set C equal to the most recent
number of an incomplete clause, i.e. C=2 (since v=0 for clause 2). This
may be changed if a delayed clause opener, such as a relative pronoun,
is encountered reasonably soon.

27. A short-form participle in the neuter singular. This may
possibly be a predicate. Set v=1.

28.  See  4. 


29. See 26. Clause 2 is considered complete since its v=1; so
is clause 1; therefore return to C=0.

30.  See  24. 

31.  See  20. 

32.  See  4. 

33. The participle following a comma indicates a participial phrase.
This is sufficient to explain the preceding comma, and therefore the
comma is not considered as ending the previous clause. We remain in
clause 0 and start phrase 4. For use in Part II-B, we place a backward
flag, to seek explanation in a foregoing occurrence.

34 - 36. See 12. Part II-B will determine how many occurrences
following 33 belong to phrase 4.

37. Start prepositional phrase. Although in reality this is a
phrase within a phrase, namely, part of the participial phrase 4,
our program does not take note of this fact.


38.  See  11. 

39.  See  12. 

40. Period, end of sentence.*) Check whether all clauses have
found their predicates. Clauses 1 and 2 had found "possible predicates"
marked v=1; these are now changed to v=2, since no inconsistency has
arisen. If there were any unsolved difficulties at this point, such as
clauses without predicates or with more than one predicate, we would
try to resolve them by iterating the entire profiling process.


*) A separate subroutine will determine whether a period does actually
indicate the end of the sentence. Moreover, if some other symbol (? or !) is
used for the purpose, an appropriate signal will be stored for use in Part II-B.
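
The rules illustrated by Notes 1 - 40 can be collected into a single forward pass. The Python sketch below is a simplified reading of those notes, not the machine routine itself: the tag names ("coord", "co", "co_inf", "prep", "partic", "inf", "fin", "short", "cma", "period", "0"), the function temporary_profile and the decision to treat a clause with v=1 as complete are assumptions made for this example. It omits delayed openers, the backward flag b, and the iteration mentioned in Note 40.

```python
def temporary_profile(tags):
    """Assign (C, v, P) to each occurrence from its profile-skeleton tag.

    Returns one (C, v, P) tuple per word (None for punctuation) and the
    final status v of every clause.
    """
    status = {0: 0}        # v per clause: 0 none, 1 possible, 2 definite predicate
    inf_ok = set()         # clauses whose opener accepts an infinitive predicate
    C = P = 0
    n_phrases = 0
    prep_pending = False   # True for the one word directly after a preposition
    after_comma = False
    rows = []
    for i, tag in enumerate(tags):
        if tag == "period":                      # Note 40: v=1 becomes v=2
            for c, v in status.items():
                if v == 1:
                    status[c] = 2
            rows.append(None)
            continue
        if tag == "cma":                         # Note 4: comma ends the phrase
            P, prep_pending, after_comma = 0, False, True
            rows.append(None)
            continue
        if tag in ("co", "co_inf") or (tag == "coord" and i > 0):
            C = max(status) + 1                  # Notes 5, 19, 22: new clause
            status[C] = 0
            P, prep_pending = 0, False
            if tag == "co_inf":
                inf_ok.add(C)
        elif tag == "prep" or (tag == "partic" and after_comma):
            n_phrases += 1                       # Notes 2, 10, 33: new phrase
            P = n_phrases
            prep_pending = tag == "prep"
        else:
            if after_comma and status[C] > 0:    # Notes 26, 29: go back to the
                open_clauses = [c for c, v in status.items() if v == 0]
                C = max(open_clauses, default=C) # latest incomplete clause
            if prep_pending:                     # Notes 3, 11: same phrase
                prep_pending = False
            elif P != 0:                         # Note 12: membership uncertain
                P = "x"
            if tag == "fin":                     # Note 24
                status[C] = 2
            elif tag == "short" and status[C] == 0:                   # Note 27
                status[C] = 1
            elif tag == "inf" and C in inf_ok and status[C] == 0:     # Note 6
                status[C] = 1
        after_comma = False
        rows.append((C, status[C], P))
    return rows, status


# Skeleton tags for the 40 occurrences of the example sentence.
tags = ("coord prep 0 cma co_inf inf 0 0 0 prep 0 0 prep 0 0 partic 0 cma "
        "co 0 cma co 0 fin cma 0 short cma 0 fin 0 cma partic 0 0 0 prep 0 0 "
        "period").split()
rows, status = temporary_profile(tags)
print(rows[25])   # occurrence 26: (2, 0, 0)
print(status)     # {0: 2, 1: 2, 2: 2, 3: 2}
```

Under these assumptions the sketch reproduces the C and P columns of the example table. It reports v for every word of a clause, whereas the table shows v only at selected occurrences.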